792 research outputs found

    Scalable Audience Reach Estimation in Real-time Online Advertising

    Full text link
    Online advertising has been introduced as one of the most efficient methods of advertising throughout the recent years. Yet, advertisers are concerned about the efficiency of their online advertising campaigns and consequently, would like to restrict their ad impressions to certain websites and/or certain groups of audience. These restrictions, known as targeting criteria, limit the reachability for better performance. This trade-off between reachability and performance illustrates a need for a forecasting system that can quickly predict/estimate (with good accuracy) this trade-off. Designing such a system is challenging due to (a) the huge amount of data to process, and, (b) the need for fast and accurate estimates. In this paper, we propose a distributed fault tolerant system that can generate such estimates fast with good accuracy. The main idea is to keep a small representative sample in memory across multiple machines and formulate the forecasting problem as queries against the sample. The key challenge is to find the best strata across the past data, perform multivariate stratified sampling while ensuring fuzzy fall-back to cover the small minorities. Our results show a significant improvement over the uniform and simple stratified sampling strategies which are currently widely used in the industry

    High-dimensional Sparse Inverse Covariance Estimation using Greedy Methods

    Full text link
    In this paper we consider the task of estimating the non-zero pattern of the sparse inverse covariance matrix of a zero-mean Gaussian random vector from a set of iid samples. Note that this is also equivalent to recovering the underlying graph structure of a sparse Gaussian Markov Random Field (GMRF). We present two novel greedy approaches to solving this problem. The first estimates the non-zero covariates of the overall inverse covariance matrix using a series of global forward and backward greedy steps. The second estimates the neighborhood of each node in the graph separately, again using greedy forward and backward steps, and combines the intermediate neighborhoods to form an overall estimate. The principal contribution of this paper is a rigorous analysis of the sparsistency, or consistency in recovering the sparsity pattern of the inverse covariance matrix. Surprisingly, we show that both the local and global greedy methods learn the full structure of the model with high probability given just O(dlog⁑(p))O(d\log(p)) samples, which is a \emph{significant} improvement over state of the art β„“1\ell_1-regularized Gaussian MLE (Graphical Lasso) that requires O(d2log⁑(p))O(d^2\log(p)) samples. Moreover, the restricted eigenvalue and smoothness conditions imposed by our greedy methods are much weaker than the strong irrepresentable conditions required by the β„“1\ell_1-regularization based methods. We corroborate our results with extensive simulations and examples, comparing our local and global greedy methods to the β„“1\ell_1-regularized Gaussian MLE as well as the Neighborhood Greedy method to that of nodewise β„“1\ell_1-regularized linear regression (Neighborhood Lasso).Comment: Accepted to AI STAT 2012 for Oral Presentatio
    • …
    corecore